Overview

Dataset statistics

Number of variables: 10
Number of observations: 633
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 49.6 KiB
Average record size in memory: 80.2 B

Variable types

Numeric: 7
Categorical: 3

Warnings

Emp Name has a high cardinality: 624 distinct values (High cardinality)
Gross Load is highly correlated with Gross Load Adjusted (High correlation)
Gross Load Adjusted is highly correlated with Gross Load (High correlation)
Emp Name is uniformly distributed (Uniform)
Emp Code has unique values (Unique)
New Customers has 176 (27.8%) zeros (Zeros)
New/Existing Customers' has 64 (10.1%) zeros (Zeros)
Inc. Gross Sales has 34 (5.4%) zeros (Zeros)
Gross Load has 56 (8.8%) zeros (Zeros)
Gross Load Adjusted has 19 (3.0%) zeros (Zeros)
Digital Activation Count - CF has 85 (13.4%) zeros (Zeros)
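
The zero percentages in the warnings above are consistent with dividing each zero count by the 633 observations. A minimal Python sketch of that arithmetic follows; it uses only figures reported in this section, and the helper name `zero_share` is illustrative rather than part of pandas-profiling.

```python
# Number of observations, as reported under "Dataset statistics".
N_OBSERVATIONS = 633

# Zero counts copied from the warnings list.
ZERO_COUNTS = {
    "New Customers": 176,
    "New/Existing Customers'": 64,
    "Inc. Gross Sales": 34,
    "Gross Load": 56,
    "Gross Load Adjusted": 19,
    "Digital Activation Count - CF": 85,
}


def zero_share(zero_count, total):
    """Return the share of zeros as a percentage, rounded to one decimal place."""
    return round(100 * zero_count / total, 1)


# Map each variable name to its zero share in percent.
shares = {name: zero_share(count, N_OBSERVATIONS) for name, count in ZERO_COUNTS.items()}

for name, share in shares.items():
    print(f"{name}: {share}%")
```

The computed shares can be compared line by line with the percentages in the warnings.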

Reproduction

Analysis started: 2021-02-01 08:15:07.658284
Analysis finished: 2021-02-01 08:15:13.293153
Duration: 5.63 seconds
Software version: pandas-profiling v2.10.0
Download configuration: config.yaml

Variables

Emp Code
Real number (ℝ≥0)

UNIQUE

Distinct: 633
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6157.672986
Minimum: 368
Maximum: 8488
Zeros: 0
Zeros (%): 0.0%
Memory size: 5.1 KiB

Quantile statistics

Minimum: 368
5-th percentile: 873
Q1: 5988
median: 6720
Q3: 8168
95-th percentile: 8440.4
Maximum: 8488
Range: 8120
Interquartile range (IQR): 2180

Descriptive statistics

Standard deviation: 2544.795403
Coefficient of variation (CV): 0.4132722554
Kurtosis: 0.2793336296
Mean: 6157.672986
Median Absolute Deviation (MAD): 1348
Skewness: -1.285544972
Sum: 3897807
Variance: 6475983.641
Monotonicity: Not monotonic
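
Several of the reported statistics are simple functions of others: the range is maximum minus minimum, the IQR is Q3 minus Q1, the coefficient of variation is the standard deviation divided by the mean, and the variance is the squared standard deviation. The short Python sketch below checks these relationships using the Emp Code figures from this report. Because the report prints rounded values, small differences in the last digits are expected.

```python
# Figures copied from the Emp Code statistics in this report.
mean = 6157.672986
std_dev = 2544.795403
minimum, maximum = 368, 8488
q1, q3 = 5988, 8168

value_range = maximum - minimum  # reported as Range
iqr = q3 - q1                    # reported as Interquartile range (IQR)
cv = std_dev / mean              # reported as Coefficient of variation (CV)
variance = std_dev ** 2          # reported as Variance

print("Range:", value_range)
print("IQR:", iqr)
print("CV:", cv)
print("Variance:", variance)
```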
Histogram with fixed size bins (bins=50)
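Common values

Value | Count | Frequency (%)
8191 | 1 | 0.2%
684 | 1 | 0.2%
8482 | 1 | 0.2%
8478 | 1 | 0.2%
8477 | 1 | 0.2%
6233 | 1 | 0.2%
8474 | 1 | 0.2%
6424 | 1 | 0.2%
8471 | 1 | 0.2%
8469 | 1 | 0.2%
Other values (623) | 623 | 98.4%

Minimum 5 values

Value | Count | Frequency (%)
368 | 1 | 0.2%
509 | 1 | 0.2%
516 | 1 | 0.2%
552 | 1 | 0.2%
554 | 1 | 0.2%

Maximum 5 values

Value | Count | Frequency (%)
8488 | 1 | 0.2%
8484 | 1 | 0.2%
8482 | 1 | 0.2%
8478 | 1 | 0.2%
8477 | 1 | 0.2%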

Emp Name
Categorical

HIGH CARDINALITY
UNIFORM

Distinct: 624
Distinct (%): 98.6%
Missing: 0
Missing (%): 0.0%
Memory size: 5.1 KiB

Muhammad Usman: 3
Ali Raza: 3
Hassan Raza: 2
Muhammad Ali: 2
Muhammad Noman: 2
Other values (619): 621

Length

Max length: 31
Median length: 13
Mean length: 13.92733017
Min length: 5

Characters and Unicode

Total characters: 8816
Distinct characters: 50
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 617
Unique (%): 97.5%

Sample

1st row: Shahid Mustafa
2nd row: Hafiz Muhammad Omer Zia
3rd row: Mohsin Arif
4th row: Agha Umaid Raza
5th row: Anum Tariq

Value | Count | Frequency (%)
Muhammad Usman | 3 | 0.5%
Ali Raza | 3 | 0.5%
Hassan Raza | 2 | 0.3%
Muhammad Ali | 2 | 0.3%
Muhammad Noman | 2 | 0.3%
Muhammad Junaid | 2 | 0.3%
Abdullah | 2 | 0.3%
Ahsan Saleem | 1 | 0.2%
Chauhdary Muhammad Younas | 1 | 0.2%
Umer Farooq | 1 | 0.2%
Other values (614) | 614 | 97.0%

Histogram of lengths of the category

Value | Count | Frequency (%)
muhammad | 105 | 7.2%
ali | 58 | 4.0%
khan | 39 | 2.7%
syed | 32 | 2.2%
ahmed | 23 | 1.6%
hussain | 16 | 1.1%
hassan | 15 | 1.0%
usman | 13 | 0.9%
iqbal | 12 | 0.8%
raza | 12 | 0.8%
Other values (606) | 1132 | 77.7%

Most occurring characters

Value | Count | Frequency (%)
a | 1593 | 18.1%
(space) | 825 | 9.4%
i | 581 | 6.6%
h | 547 | 6.2%
m | 516 | 5.9%
d | 395 | 4.5%
n | 374 | 4.2%
e | 367 | 4.2%
r | 351 | 4.0%
u | 300 | 3.4%
Other values (40) | 2967 | 33.7%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 6538 | 74.2%
Uppercase Letter | 1453 | 16.5%
Space Separator | 825 | 9.4%

Most frequent character per category

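Lowercase Letter

Value | Count | Frequency (%)
a | 1593 | 24.4%
i | 581 | 8.9%
h | 547 | 8.4%
m | 516 | 7.9%
d | 395 | 6.0%
n | 374 | 5.7%
e | 367 | 5.6%
r | 351 | 5.4%
u | 300 | 4.6%
s | 295 | 4.5%
Other values (15) | 1219 | 18.6%

Uppercase Letter

Value | Count | Frequency (%)
A | 287 | 19.8%
S | 221 | 15.2%
M | 175 | 12.0%
K | 93 | 6.4%
H | 88 | 6.1%
F | 72 | 5.0%
N | 71 | 4.9%
R | 64 | 4.4%
Z | 53 | 3.6%
U | 49 | 3.4%
Other values (14) | 280 | 19.3%

Space Separator

Value | Count | Frequency (%)
(space) | 825 | 100.0%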

Most occurring scripts

Value | Count | Frequency (%)
Latin | 7991 | 90.6%
Common | 825 | 9.4%

Most frequent character per script

Latin

Value | Count | Frequency (%)
a | 1593 | 19.9%
i | 581 | 7.3%
h | 547 | 6.8%
m | 516 | 6.5%
d | 395 | 4.9%
n | 374 | 4.7%
e | 367 | 4.6%
r | 351 | 4.4%
u | 300 | 3.8%
s | 295 | 3.7%
Other values (39) | 2672 | 33.4%

Common

Value | Count | Frequency (%)
(space) | 825 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 8816 | 100.0%

Most frequent character per block

ASCII

Value | Count | Frequency (%)
a | 1593 | 18.1%
(space) | 825 | 9.4%
i | 581 | 6.6%
h | 547 | 6.2%
m | 516 | 5.9%
d | 395 | 4.5%
n | 374 | 4.2%
e | 367 | 4.2%
r | 351 | 4.0%
u | 300 | 3.4%
Other values (40) | 2967 | 33.7%

Designation
Categorical

Distinct: 3
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 5.1 KiB

Relationship Manager: 302
Assistant Sales Manager: 209
Sales Manager: 122

Length

Max length: 23
Median length: 20
Mean length: 19.64139021
Min length: 13

Characters and Unicode

Total characters: 12433
Distinct characters: 17
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Sales Manager
2nd row: Relationship Manager
3rd row: Sales Manager
4th row: Assistant Sales Manager
5th row: Sales Manager

Value | Count | Frequency (%)
Relationship Manager | 302 | 47.7%
Assistant Sales Manager | 209 | 33.0%
Sales Manager | 122 | 19.3%

Histogram of lengths of the category

Value | Count | Frequency (%)
manager | 633 | 42.9%
sales | 331 | 22.4%
relationship | 302 | 20.5%
assistant | 209 | 14.2%

Most occurring characters

Value | Count | Frequency (%)
a | 2108 | 17.0%
e | 1266 | 10.2%
s | 1260 | 10.1%
n | 1144 | 9.2%
(space) | 842 | 6.8%
i | 813 | 6.5%
t | 720 | 5.8%
l | 633 | 5.1%
M | 633 | 5.1%
g | 633 | 5.1%
Other values (7) | 2381 | 19.2%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 10116 | 81.4%
Uppercase Letter | 1475 | 11.9%
Space Separator | 842 | 6.8%

Most frequent character per category

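Lowercase Letter

Value | Count | Frequency (%)
a | 2108 | 20.8%
e | 1266 | 12.5%
s | 1260 | 12.5%
n | 1144 | 11.3%
i | 813 | 8.0%
t | 720 | 7.1%
l | 633 | 6.3%
g | 633 | 6.3%
r | 633 | 6.3%
o | 302 | 3.0%
Other values (2) | 604 | 6.0%

Uppercase Letter

Value | Count | Frequency (%)
M | 633 | 42.9%
S | 331 | 22.4%
R | 302 | 20.5%
A | 209 | 14.2%

Space Separator

Value | Count | Frequency (%)
(space) | 842 | 100.0%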

Most occurring scripts

Value | Count | Frequency (%)
Latin | 11591 | 93.2%
Common | 842 | 6.8%

Most frequent character per script

Latin

Value | Count | Frequency (%)
a | 2108 | 18.2%
e | 1266 | 10.9%
s | 1260 | 10.9%
n | 1144 | 9.9%
i | 813 | 7.0%
t | 720 | 6.2%
l | 633 | 5.5%
M | 633 | 5.5%
g | 633 | 5.5%
r | 633 | 5.5%
Other values (6) | 1748 | 15.1%

Common

Value | Count | Frequency (%)
(space) | 842 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 12433 | 100.0%

Most frequent character per block

ASCII

Value | Count | Frequency (%)
a | 2108 | 17.0%
e | 1266 | 10.2%
s | 1260 | 10.1%
n | 1144 | 9.2%
(space) | 842 | 6.8%
i | 813 | 6.5%
t | 720 | 5.8%
l | 633 | 5.1%
M | 633 | 5.1%
g | 633 | 5.1%
Other values (7) | 2381 | 19.2%

Region
Categorical

Distinct: 5
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Memory size: 5.1 KiB

Central: 217
South: 185
North: 118
KPK: 62
Multan: 51

Length

Max length: 7
Median length: 5
Mean length: 5.570300158
Min length: 3

Characters and Unicode

Total characters: 3526
Distinct characters: 15
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Central
2nd row: Central
3rd row: North
4th row: North
5th row: North

Value | Count | Frequency (%)
Central | 217 | 34.3%
South | 185 | 29.2%
North | 118 | 18.6%
KPK | 62 | 9.8%
Multan | 51 | 8.1%

Histogram of lengths of the category

Value | Count | Frequency (%)
central | 217 | 34.3%
south | 185 | 29.2%
north | 118 | 18.6%
kpk | 62 | 9.8%
multan | 51 | 8.1%

Most occurring characters

Value | Count | Frequency (%)
t | 571 | 16.2%
r | 335 | 9.5%
o | 303 | 8.6%
h | 303 | 8.6%
n | 268 | 7.6%
a | 268 | 7.6%
l | 268 | 7.6%
u | 236 | 6.7%
C | 217 | 6.2%
e | 217 | 6.2%
Other values (5) | 540 | 15.3%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 2769 | 78.5%
Uppercase Letter | 757 | 21.5%

Most frequent character per category

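Lowercase Letter

Value | Count | Frequency (%)
t | 571 | 20.6%
r | 335 | 12.1%
o | 303 | 10.9%
h | 303 | 10.9%
n | 268 | 9.7%
a | 268 | 9.7%
l | 268 | 9.7%
u | 236 | 8.5%
e | 217 | 7.8%

Uppercase Letter

Value | Count | Frequency (%)
C | 217 | 28.7%
S | 185 | 24.4%
K | 124 | 16.4%
N | 118 | 15.6%
P | 62 | 8.2%
M | 51 | 6.7%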

Most occurring scripts

Value | Count | Frequency (%)
Latin | 3526 | 100.0%

Most frequent character per script

Latin

Value | Count | Frequency (%)
t | 571 | 16.2%
r | 335 | 9.5%
o | 303 | 8.6%
h | 303 | 8.6%
n | 268 | 7.6%
a | 268 | 7.6%
l | 268 | 7.6%
u | 236 | 6.7%
C | 217 | 6.2%
e | 217 | 6.2%
Other values (5) | 540 | 15.3%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 3526 | 100.0%

Most frequent character per block

ASCII

Value | Count | Frequency (%)
t | 571 | 16.2%
r | 335 | 9.5%
o | 303 | 8.6%
h | 303 | 8.6%
n | 268 | 7.6%
a | 268 | 7.6%
l | 268 | 7.6%
u | 236 | 6.7%
C | 217 | 6.2%
e | 217 | 6.2%
Other values (5) | 540 | 15.3%

New Customers
Real number (ℝ≥0)

ZEROS

Distinct: 15
Distinct (%): 2.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.015797788
Minimum: 0
Maximum: 19
Zeros: 176
Zeros (%): 27.8%
Memory size: 5.1 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 1
Q3: 3
95-th percentile: 6
Maximum: 19
Range: 19
Interquartile range (IQR): 3

Descriptive statistics

Standard deviation: 2.263441573
Coefficient of variation (CV): 1.122851502
Kurtosis: 7.796467086
Mean: 2.015797788
Median Absolute Deviation (MAD): 1
Skewness: 2.119844251
Sum: 1276
Variance: 5.123167757
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=15)
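Common values

Value | Count | Frequency (%)
0 | 176 | 27.8%
1 | 155 | 24.5%
2 | 114 | 18.0%
3 | 68 | 10.7%
4 | 44 | 7.0%
5 | 25 | 3.9%
6 | 23 | 3.6%
7 | 10 | 1.6%
8 | 6 | 0.9%
9 | 5 | 0.8%
Other values (5) | 7 | 1.1%

Minimum 5 values

Value | Count | Frequency (%)
0 | 176 | 27.8%
1 | 155 | 24.5%
2 | 114 | 18.0%
3 | 68 | 10.7%
4 | 44 | 7.0%

Maximum 5 values

Value | Count | Frequency (%)
19 | 1 | 0.2%
15 | 1 | 0.2%
12 | 1 | 0.2%
11 | 1 | 0.2%
10 | 3 | 0.5%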

New/Existing Customers'
Real number (ℝ≥0)

ZEROS

Distinct: 18
Distinct (%): 2.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.791469194
Minimum: 0
Maximum: 22
Zeros: 64
Zeros (%): 10.1%
Memory size: 5.1 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 1
median: 3
Q3: 5
95-th percentile: 11
Maximum: 22
Range: 22
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 3.285101085
Coefficient of variation (CV): 0.866445411
Kurtosis: 3.7561154
Mean: 3.791469194
Median Absolute Deviation (MAD): 2
Skewness: 1.579867222
Sum: 2400
Variance: 10.79188914
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=18)
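Common values

Value | Count | Frequency (%)
1 | 107 | 16.9%
2 | 96 | 15.2%
3 | 86 | 13.6%
4 | 74 | 11.7%
0 | 64 | 10.1%
5 | 62 | 9.8%
6 | 41 | 6.5%
7 | 28 | 4.4%
8 | 22 | 3.5%
11 | 13 | 2.1%
Other values (8) | 40 | 6.3%

Minimum 5 values

Value | Count | Frequency (%)
0 | 64 | 10.1%
1 | 107 | 16.9%
2 | 96 | 15.2%
3 | 86 | 13.6%
4 | 74 | 11.7%

Maximum 5 values

Value | Count | Frequency (%)
22 | 1 | 0.2%
20 | 1 | 0.2%
19 | 2 | 0.3%
14 | 5 | 0.8%
13 | 6 | 0.9%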

Inc. Gross Sales
Real number (ℝ)

ZEROS

Distinct: 565
Distinct (%): 89.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4045572.907
Minimum: -903987
Maximum: 46070446
Zeros: 34
Zeros (%): 5.4%
Memory size: 5.1 KiB

Quantile statistics

Minimum: -903987
5-th percentile: 0
Q1: 597505
median: 2127080
Q3: 5002299
95-th percentile: 15352591.4
Maximum: 46070446
Range: 46974433
Interquartile range (IQR): 4404794

Descriptive statistics

Standard deviation: 5803295.667
Coefficient of variation (CV): 1.434480555
Kurtosis: 15.56383604
Mean: 4045572.907
Median Absolute Deviation (MAD): 1827080
Skewness: 3.359102973
Sum: 2560847650
Variance: 3.36782406 × 10^13
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value | Count | Frequency (%)
0 | 34 | 5.4%
300000 | 8 | 1.3%
500000 | 4 | 0.6%
50000 | 4 | 0.6%
5000 | 4 | 0.6%
1000000 | 3 | 0.5%
400000 | 3 | 0.5%
100000 | 3 | 0.5%
1500000 | 3 | 0.5%
1150000 | 2 | 0.3%
Other values (555) | 565 | 89.3%

Minimum 5 values

Value | Count | Frequency (%)
-903987 | 1 | 0.2%
-874599 | 1 | 0.2%
-736159 | 1 | 0.2%
-500000 | 1 | 0.2%
-305395 | 1 | 0.2%

Maximum 5 values

Value | Count | Frequency (%)
46070446 | 1 | 0.2%
44419675 | 1 | 0.2%
42595993 | 1 | 0.2%
37370492 | 1 | 0.2%
35652813 | 1 | 0.2%

Gross Load
Real number (ℝ)

HIGH CORRELATION
ZEROS

Distinct: 542
Distinct (%): 85.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 55141.20221
Minimum: -65366
Maximum: 1298416
Zeros: 56
Zeros (%): 8.8%
Memory size: 5.1 KiB

Quantile statistics

Minimum: -65366
5-th percentile: 0
Q1: 4972
median: 21429
Q3: 57043
95-th percentile: 224864.4
Maximum: 1298416
Range: 1363782
Interquartile range (IQR): 52071

Descriptive statistics

Standard deviation: 112550.9647
Coefficient of variation (CV): 2.041140928
Kurtosis: 48.0680877
Mean: 55141.20221
Median Absolute Deviation (MAD): 19948
Skewness: 5.868597777
Sum: 34904381
Variance: 1.266771965 × 10^10
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
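Common values

Value | Count | Frequency (%)
0 | 56 | 8.8%
989 | 5 | 0.8%
494 | 5 | 0.8%
49 | 4 | 0.6%
4944 | 4 | 0.6%
2966 | 4 | 0.6%
1451 | 2 | 0.3%
14508 | 2 | 0.3%
4500 | 2 | 0.3%
1780 | 2 | 0.3%
Other values (532) | 547 | 86.4%

Minimum 5 values

Value | Count | Frequency (%)
-65366 | 1 | 0.2%
-7962 | 1 | 0.2%
0 | 56 | 8.8%
49 | 4 | 0.6%
50 | 1 | 0.2%

Maximum 5 values

Value | Count | Frequency (%)
1298416 | 1 | 0.2%
1193379 | 1 | 0.2%
770428 | 1 | 0.2%
769935 | 1 | 0.2%
694479 | 1 | 0.2%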

Gross Load Adjusted
Real number (ℝ)

HIGH CORRELATION
ZEROS

Distinct: 572
Distinct (%): 90.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 67400.28594
Minimum: -52866
Maximum: 1315916
Zeros: 19
Zeros (%): 3.0%
Memory size: 5.1 KiB

Quantile statistics

Minimum: -52866
5-th percentile: 1470.2
Q1: 12521
median: 34394
Q3: 73265
95-th percentile: 239250.6
Maximum: 1315916
Range: 1368782
Interquartile range (IQR): 60744

Descriptive statistics

Standard deviation: 115270.0783
Coefficient of variation (CV): 1.710231294
Kurtosis: 44.14405023
Mean: 67400.28594
Median Absolute Deviation (MAD): 26845
Skewness: 5.548201278
Sum: 42664381
Variance: 1.328719094 × 10^10
Monotonicity: Decreasing
Histogram with fixed size bins (bins=50)
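Common values

Value | Count | Frequency (%)
0 | 19 | 3.0%
2500 | 18 | 2.8%
5000 | 9 | 1.4%
7500 | 7 | 1.1%
15989 | 3 | 0.5%
4944 | 3 | 0.5%
20000 | 3 | 0.5%
5466 | 3 | 0.5%
7549 | 2 | 0.3%
12360 | 2 | 0.3%
Other values (562) | 564 | 89.1%

Minimum 5 values

Value | Count | Frequency (%)
-52866 | 1 | 0.2%
0 | 19 | 3.0%
198 | 1 | 0.2%
297 | 1 | 0.2%
494 | 1 | 0.2%

Maximum 5 values

Value | Count | Frequency (%)
1315916 | 1 | 0.2%
1208379 | 1 | 0.2%
785428 | 1 | 0.2%
779935 | 1 | 0.2%
696979 | 1 | 0.2%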

Digital Activation Count - CF
Real number (ℝ≥0)

ZEROS

Distinct: 33
Distinct (%): 5.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.903633491
Minimum: 0
Maximum: 49
Zeros: 85
Zeros (%): 13.4%
Memory size: 5.1 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 1
median: 3
Q3: 6
95-th percentile: 15
Maximum: 49
Range: 49
Interquartile range (IQR): 5

Descriptive statistics

Standard deviation: 5.82567452
Coefficient of variation (CV): 1.188032207
Kurtosis: 13.58076572
Mean: 4.903633491
Median Absolute Deviation (MAD): 2
Skewness: 3.017197276
Sum: 3104
Variance: 33.93848361
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=33)
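Common values

Value | Count | Frequency (%)
1 | 91 | 14.4%
2 | 88 | 13.9%
0 | 85 | 13.4%
3 | 68 | 10.7%
4 | 60 | 9.5%
6 | 47 | 7.4%
5 | 46 | 7.3%
7 | 28 | 4.4%
8 | 22 | 3.5%
9 | 17 | 2.7%
Other values (23) | 81 | 12.8%

Minimum 5 values

Value | Count | Frequency (%)
0 | 85 | 13.4%
1 | 91 | 14.4%
2 | 88 | 13.9%
3 | 68 | 10.7%
4 | 60 | 9.5%

Maximum 5 values

Value | Count | Frequency (%)
49 | 1 | 0.2%
43 | 1 | 0.2%
40 | 1 | 0.2%
37 | 1 | 0.2%
36 | 1 | 0.2%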

Interactions

[42 interaction plots (Matplotlib v3.3.0 SVG figures); images not available in this text version]

Correlations


Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
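
The calculation described above can be sketched in plain Python. This is a minimal illustration, not the code pandas-profiling runs; the function name `pearson_r` is an assumption, and the sample values are the five largest Gross Load values together with the Gross Load Adjusted values of the same rows (rows 0 to 4 of the sample at the end of this report).

```python
from math import sqrt


def pearson_r(xs, ys):
    """Covariance of xs and ys divided by the product of their standard deviations."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with at least two points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sample covariance and sample standard deviations (n - 1 in the denominator).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)
    sd_x = sqrt(sum((x - mean_x) ** 2 for x in xs) / (n - 1))
    sd_y = sqrt(sum((y - mean_y) ** 2 for y in ys) / (n - 1))
    # A constant input has zero standard deviation and would raise ZeroDivisionError.
    return cov / (sd_x * sd_y)


gross_load = [1298416, 1193379, 770428, 769935, 694479]
gross_load_adjusted = [1315916, 1208379, 785428, 779935, 696979]

r_sample = pearson_r(gross_load, gross_load_adjusted)
print(r_sample)
```

On these five rows r comes out very close to +1, in line with the High correlation warning for the two columns. Five rows are only an illustration and say nothing definitive about the full 633-row dataset.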

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
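
A minimal Python sketch of this rank-based calculation follows. It assumes ties receive the average of their ranks, which is a common convention but not something this report states, and the function names are illustrative.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman_rho(xs, ys):
    """Covariance of the rank variables divided by the product of their standard deviations."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in rx))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in ry))
    # The common 1 / (n - 1) factors cancel, so they are omitted.
    return cov / (sd_x * sd_y)


# A nonlinear but strictly increasing relationship (y = x ** 3).
print(spearman_rho([1, 2, 3, 4, 5], [1, 8, 27, 64, 125]))
```

Because only the ordering matters, the cubic relationship in the usage line is treated as a perfect monotonic association.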

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
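
The pair counting can be sketched as follows. This implements only the simple form described above, in which tied pairs count as neither concordant nor discordant. Library implementations may apply tie corrections, so their results can differ on data with many ties. The function name is illustrative.

```python
from itertools import combinations


def kendall_tau_simple(xs, ys):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with at least two points")
    concordant = 0
    discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1   # both variables move in the same direction
        elif direction < 0:
            discordant += 1   # the variables move in opposite directions
        # direction == 0 means a tie in x or y; it is counted in neither group
    n = len(xs)
    total_pairs = n * (n - 1) / 2
    return (concordant - discordant) / total_pairs


# Three observations: two concordant pairs and one discordant pair.
print(kendall_tau_simple([1, 2, 3], [1, 3, 2]))
```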

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. Extensive documentation is available for the phik package.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
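
A minimal Python sketch of a bias-corrected Cramér's V follows, assuming the commonly cited form of Bergsma's correction. It is not taken from the pandas-profiling source. The function name and the contingency tables in the usage lines are hypothetical and do not come from this dataset, and every row and column total is assumed to be positive.

```python
from math import sqrt


def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for a contingency table given as a list of rows."""
    n_rows = len(table)
    n_cols = len(table[0])
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(n_rows)) for j in range(n_cols)]

    # Pearson chi-squared statistic against the independence expectation.
    chi2 = 0.0
    for i in range(n_rows):
        for j in range(n_cols):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected

    phi2 = chi2 / n
    # Correction terms: shrink phi squared and the effective table dimensions.
    phi2_corr = max(0.0, phi2 - (n_cols - 1) * (n_rows - 1) / (n - 1))
    rows_corr = n_rows - (n_rows - 1) ** 2 / (n - 1)
    cols_corr = n_cols - (n_cols - 1) ** 2 / (n - 1)
    return sqrt(phi2_corr / min(cols_corr - 1, rows_corr - 1))


# Hypothetical tables: the first is exactly independent, the second perfectly associated.
print(cramers_v_corrected([[10, 20], [20, 40]]))
print(cramers_v_corrected([[50, 0], [0, 50]]))
```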

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

Index | Emp Code | Emp Name | Designation | Region | New Customers | New/Existing Customers' | Inc. Gross Sales | Gross Load | Gross Load Adjusted | Digital Activation Count - CF
0 | 1097 | Shahid Mustafa | Sales Manager | Central | 19 | 22 | 46070446 | 1298416 | 1315916 | 7
1 | 8214 | Hafiz Muhammad Omer Zia | Relationship Manager | Central | 6 | 8 | 42595993 | 1193379 | 1208379 | 6
2 | 914 | Mohsin Arif | Sales Manager | North | 2 | 3 | 25178905 | 770428 | 785428 | 6
3 | 6895 | Agha Umaid Raza | Assistant Sales Manager | North | 2 | 3 | 25074550 | 769935 | 779935 | 4
4 | 5805 | Anum Tariq | Sales Manager | North | 2 | 5 | 22877214 | 694479 | 696979 | 1
5 | 6829 | Intikhab Ahmad | Sales Manager | Central | 11 | 13 | 26551881 | 666281 | 696281 | 12
6 | 1113 | Waqar Khan | Assistant Sales Manager | North | 0 | 4 | 16116093 | 557494 | 567494 | 4
7 | 6979 | Muhammad Usman | Assistant Sales Manager | North | 5 | 6 | 23449350 | 469199 | 501699 | 13
8 | 8121 | Waleed Bin Tariq | Relationship Manager | Central | 4 | 4 | 15834814 | 457605 | 467605 | 4
9 | 6467 | Amna Zameer | Relationship Manager | North | 2 | 3 | 14108569 | 428077 | 428077 | 0

Last rows

Index | Emp Code | Emp Name | Designation | Region | New Customers | New/Existing Customers' | Inc. Gross Sales | Gross Load | Gross Load Adjusted | Digital Activation Count - CF
623 | 8117 | Muhammad Arif Khan | Relationship Manager | South | 0 | 0 | 0 | 0 | 0 | 0
624 | 8252 | Mian Muhammad Yasir Saleem | Assistant Sales Manager | Central | 0 | 0 | 0 | 0 | 0 | 0
625 | 8268 | Jamal Uddin | Relationship Manager | South | 0 | 0 | 0 | 0 | 0 | 0
626 | 8288 | Muhammad Humair | Relationship Manager | South | 1 | 1 | 100000 | 0 | 0 | 0
627 | 8336 | Razia Nawab | Relationship Manager | North | 0 | 0 | 0 | 0 | 0 | 0
628 | 8353 | Muhammad Faizan Khurshid Abbasi | Relationship Manager | North | 0 | 0 | 0 | 0 | 0 | 0
629 | 8357 | Saad Bin Ghani | Relationship Manager | South | 0 | 0 | 0 | 0 | 0 | 0
630 | 8397 | Mehnaz Hameed | Relationship Manager | Multan | 1 | 1 | 6000000 | 0 | 0 | 0
631 | 8460 | Tahseen Khan | Assistant Sales Manager | South | 1 | 1 | 500000 | 0 | 0 | 0
632 | 772 | Muhammad Imran Qayyum | Sales Manager | Central | 0 | 3 | 779666 | -65366 | -52866 | 5